Results 1 - 20 of 35
1.
Sci Adv ; 9(40): eadi1480, 2023 10 06.
Article in English | MEDLINE | ID: mdl-37801497

ABSTRACT

Spiking neural networks (SNNs) aim to realize brain-inspired intelligence on neuromorphic chips with high energy efficiency by introducing neural dynamics and spike properties. As the emerging spiking deep learning paradigm attracts increasing interest, traditional programming frameworks cannot meet the demands of automatic differentiation, accelerated parallel computation, and tight integration of neuromorphic dataset processing and deployment. In this work, we present the SpikingJelly framework to address this dilemma. We contribute a full-stack toolkit for preprocessing neuromorphic datasets, building deep SNNs, optimizing their parameters, and deploying SNNs on neuromorphic chips. Compared to existing methods, the training of deep SNNs can be accelerated 11×, and the superior extensibility and flexibility of SpikingJelly enable users to accelerate custom models at low cost through multilevel inheritance and semiautomatic code generation. SpikingJelly paves the way for synthesizing truly energy-efficient SNN-based machine intelligence systems, which will enrich the ecosystem of neuromorphic computing.
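
To make the framework's role concrete, here is a minimal sketch of building and running a small deep SNN with SpikingJelly. It assumes the `activation_based` module layout of recent SpikingJelly releases (older releases used `spikingjelly.clock_driven`); the architecture and hyperparameters are illustrative only.

```python
# Minimal SpikingJelly sketch; assumes the `activation_based` API layout.
import torch
import torch.nn as nn
from spikingjelly.activation_based import neuron, layer, functional

net = nn.Sequential(
    nn.Flatten(),
    layer.Linear(28 * 28, 100),
    neuron.LIFNode(tau=2.0),   # leaky integrate-and-fire spiking neurons
    layer.Linear(100, 10),
    neuron.LIFNode(tau=2.0),
)

x = torch.rand(8, 1, 28, 28)             # dummy batch of 8 images
T = 4                                    # number of simulation time steps
out = sum(net(x) for _ in range(T)) / T  # rate-coded readout over T steps
functional.reset_net(net)                # clear membrane states between samples
```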


Subject(s)
Algorithms, Neurons, Neural Networks (Computer), Machine Learning, Intelligence
2.
J Comput Neurosci ; 51(4): 475-490, 2023 11.
Article in English | MEDLINE | ID: mdl-37721653

ABSTRACT

Spiking neural networks (SNNs), as the third generation of neural networks, are based on biological models of human brain neurons. In this work, a shallow SNN plays the role of an explicit image encoder for image classification. An LSTM-based EEG encoder is used to construct the EEG-based feature space, a space that is discriminative in terms of SVM classification accuracy. The visual feature vectors extracted by the SNN are then mapped to the EEG-based discriminative feature space by manifold transfer based on mutual k-nearest neighbors (Mk-NN MT). This proposed "brain-guided system" improves the separability of the SNN-based visual feature space. In the test phase, the spike patterns extracted by the SNN from the input image are mapped to the LSTM-based EEG feature space and then classified without the need for EEG signals. The SNN-based image encoder is trained by the conversion method, and the results are evaluated and compared with other training methods on the challenging small ImageNet-EEG dataset. Experimental results show that the proposed manifold transfer from the SNN-based feature space to the LSTM-based EEG feature space improves image classification accuracy by up to 14.25%. Thus, embedding the SNN in a brain-guided system trained on a small dataset improves its performance in image classification.
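
The mutual k-nearest-neighbor (Mk-NN) criterion underlying the manifold transfer step can be sketched in a few lines. This toy construction is our illustration of the standard Mk-NN graph, not the authors' implementation.

```python
# Toy mutual k-NN graph: connect i and j only if each is among the
# other's k nearest neighbors (stricter and more noise-robust than k-NN).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def mutual_knn_graph(feats, k=5):
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(feats)
    idx = nbrs.kneighbors(feats, return_distance=False)[:, 1:]  # drop self
    n = len(feats)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in idx[i]:
            if i in idx[j]:                  # mutuality check
                adj[i, j] = adj[j, i] = True
    return adj

graph = mutual_knn_graph(np.random.rand(100, 64), k=5)
```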


Subject(s)
Neurological Models, Neural Networks (Computer), Humans, Brain/physiology, Neurons/physiology
3.
Front Neurosci ; 17: 1160034, 2023.
Article in English | MEDLINE | ID: mdl-37250425

ABSTRACT

Event-based cameras are attracting growing interest within the computer vision community. These sensors operate with asynchronous pixels, emitting events, or "spikes", when the luminance change at a given pixel since the last event surpasses a certain threshold. Thanks to their inherent qualities, such as low power consumption, low latency, and high dynamic range, they seem particularly tailored to applications with challenging temporal constraints and safety requirements. Event-based sensors are an excellent fit for spiking neural networks (SNNs), since coupling an asynchronous sensor with neuromorphic hardware can yield real-time systems with minimal power requirements. In this work, we seek to develop one such system, using both event sensor data from the DSEC dataset and spiking neural networks to estimate optical flow for driving scenarios. We propose a U-Net-like SNN which, after supervised training, is able to make dense optical flow estimations. To do so, we encourage both a minimal norm for the error vector and a minimal angle between the ground-truth and predicted flow, training our model with backpropagation using a surrogate gradient. In addition, the use of 3D convolutions allows us to capture the dynamic nature of the data by increasing the temporal receptive fields. Upsampling after each decoding stage ensures that each decoder's output contributes to the final estimation. Thanks to separable convolutions, we have been able to develop a light model (compared to competitors) that can nonetheless yield reasonably accurate optical flow estimates.
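
The surrogate-gradient trick mentioned above fits in a short PyTorch sketch: the forward pass is the usual hard spike threshold, while the backward pass substitutes a smooth derivative so backpropagation can proceed. The fast-sigmoid surrogate used here is one common choice, not necessarily the one used in this paper.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()            # Heaviside: non-differentiable

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        surrogate = 1.0 / (1.0 + 10.0 * v.abs()) ** 2  # fast-sigmoid derivative
        return grad_output * surrogate

spike = SurrogateSpike.apply  # usable inside any PyTorch-based SNN layer
```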

4.
Front Neurosci ; 16: 971937, 2022.
Article in English | MEDLINE | ID: mdl-36225737

ABSTRACT

Spiking neural networks (SNNs) using time-to-first-spike (TTFS) codes, in which neurons fire at most once, are appealing for rapid and low-power processing. In this theoretical paper, we focus on information coding and decoding in those networks, and introduce a new unifying mathematical framework that allows the comparison of various coding schemes. In an early proposal, called rank-order coding (ROC), neurons are maximally activated when inputs arrive in the order of their synaptic weights, thanks to a shunting inhibition mechanism that progressively desensitizes the neurons as spikes arrive. In another proposal, called NoM coding, only the first N spikes of M input neurons are propagated, and these "first spike patterns" can be read out by downstream neurons with homogeneous weights and no desensitization: as a result, the exact order among the first spikes does not matter. This paper also introduces a third option, "Ranked-NoM" (R-NoM), which combines features from both the ROC and NoM coding schemes: only the first N input spikes are propagated, but their order is read out by downstream neurons thanks to inhomogeneous weights and linear desensitization. The unifying mathematical framework allows the three codes to be compared in terms of discriminability, which measures to what extent a neuron responds more strongly to its preferred input spike pattern than to random patterns. This discriminability turns out to be much higher for R-NoM than for the other codes, especially in the early phase of the responses. We also argue that R-NoM is much more hardware-friendly than the original ROC proposal, although NoM remains the easiest to implement in hardware because it only requires binary synapses.
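
A toy comparison of the three readout schemes on a single neuron may help; the weight vector, modulation factor, and desensitization profile below are illustrative choices, not the paper's parameterization.

```python
import numpy as np

def roc_activation(order, w, mod=0.9):
    """ROC: the i-th arriving spike is damped by mod**i (shunting inhibition)."""
    return sum(w[inp] * mod**i for i, inp in enumerate(order))

def nom_activation(order, n):
    """NoM: count the first n spikes; homogeneous unit weights, order ignored."""
    return len(order[:n])

def rnom_activation(order, w, n):
    """R-NoM: first n spikes only, inhomogeneous weights + linear desensitization."""
    return sum(w[inp] * (n - i) / n for i, inp in enumerate(order[:n]))

order = [3, 0, 2, 1]                 # input indices, in firing order
w = np.array([0.4, 0.3, 0.2, 0.1])   # synaptic weights
print(roc_activation(order, w), nom_activation(order, 3), rnom_activation(order, w, 3))
```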

5.
Front Neurosci ; 15: 727448, 2021.
Article in English | MEDLINE | ID: mdl-34602970

ABSTRACT

The early visual cortex is the site of crucial pre-processing for more complex, biologically relevant computations that drive perception and, ultimately, behaviour. This pre-processing is often studied under the assumption that neural populations are optimised for the most efficient (in terms of energy, information, spikes, etc.) representation of natural statistics. Normative models such as Independent Component Analysis (ICA) and Sparse Coding (SC) treat the phenomenon as a generative minimisation problem that they assume the early cortical populations have evolved to solve. However, measurements in monkey and cat suggest that receptive fields (RFs) in the primary visual cortex are often noisy, blobby, and symmetrical, making them sub-optimal for operations such as edge detection. We propose that this sub-optimality arises because the RFs do not emerge through a global minimisation of generative error, but through locally operating biological mechanisms such as spike-timing-dependent plasticity (STDP). Using a network endowed with an abstract, rank-based STDP rule, we show that the shape and orientation tuning of the converged units are remarkably close to single-cell measurements in the macaque primary visual cortex. We quantify this similarity using physiological parameters (frequency-normalised spread vectors), information-theoretic measures [Kullback-Leibler (KL) divergence and Gini index], as well as simulations of a typical electrophysiology experiment designed to estimate orientation tuning curves. Taken together, our results suggest that, compared to purely generative schemes, process-based biophysical models may offer a better description of the sub-optimality observed in the early visual cortex.

6.
Front Comput Neurosci ; 15: 658764, 2021.
Article in English | MEDLINE | ID: mdl-34108870

ABSTRACT

In recent years, event-based sensors have been combined with spiking neural networks (SNNs) to create a new generation of bio-inspired artificial vision systems. These systems can process spatio-temporal data in real time and are highly energy efficient. In this study, we used a new hybrid event-based camera in conjunction with a multi-layer spiking neural network trained with a spike-timing-dependent plasticity learning rule. We showed that neurons learn from repeated and correlated spatio-temporal patterns in an unsupervised way and become selective to motion features, such as direction and speed. This motion selectivity can then be used to predict ball trajectories by adding a simple read-out layer composed of polynomial regressions trained in a supervised manner. Hence, we show that an SNN receiving inputs from an event-based sensor can extract relevant spatio-temporal patterns to process and predict ball trajectories.
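
The supervised read-out stage can be sketched with standard tools; the data shapes and the degree-2 polynomial below are illustrative assumptions, not the study's actual configuration.

```python
# Hedged sketch: map time-binned spike counts of motion-selective neurons
# to a future ball position via polynomial regression.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
spike_counts = rng.poisson(3.0, size=(500, 32))  # 500 snapshots, 32 neurons
future_xy = rng.normal(size=(500, 2))            # target trajectory points

readout = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
readout.fit(spike_counts, future_xy)             # supervised read-out layer
predicted_xy = readout.predict(spike_counts[:5])
```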

7.
Neuron ; 109(4): 571-575, 2021 02 17.
Article in English | MEDLINE | ID: mdl-33600754

ABSTRACT

Recent research resolves the challenging problem of building biophysically plausible spiking neural models that are also capable of complex information processing. This advance creates new opportunities in neuroscience and neuromorphic engineering, which we discussed at an online focus meeting.


Subject(s)
Biomedical Engineering/trends, Neurological Models, Neural Networks (Computer), Neurosciences/trends, Biomedical Engineering/methods, Forecasting, Humans, Neurons/physiology, Neurosciences/methods
8.
Int J Neural Syst ; 30(6): 2050027, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32466691

ABSTRACT

We propose a new supervised learning rule for multilayer spiking neural networks (SNNs) that use a form of temporal coding known as rank-order coding. With this coding scheme, all neurons fire exactly one spike per stimulus, but the firing order carries information. In particular, in the readout layer, the first neuron to fire determines the class of the stimulus. We derive a new learning rule for this sort of network, named S4NN, akin to traditional error backpropagation, yet based on latencies. We show how approximated error gradients can be computed backward in a feedforward network with any number of layers. This approach reaches state-of-the-art performance among supervised, fully connected multilayer SNNs: test accuracies of 97.4% on the MNIST dataset and 99.2% on the Caltech Face/Motorbike dataset. Yet the neuron model that we use, non-leaky integrate-and-fire, is much simpler than those used in all previous works. The source code of the proposed S4NN is publicly available at https://github.com/SRKH/S4NN.
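
The latency code S4NN operates on can be illustrated with a toy non-leaky integrate-and-fire neuron that fires at most once, the first time its accumulated input crosses threshold, so earlier firing signals stronger evidence. This is our sketch, not the released S4NN code.

```python
import numpy as np

def first_spike_time(input_spikes, weights, threshold):
    """input_spikes: (T, n_inputs) binary array; returns firing step or None."""
    potential = 0.0
    for t, spikes_t in enumerate(input_spikes):
        potential += np.dot(weights, spikes_t)  # no leak: potential only grows
        if potential >= threshold:
            return t        # the readout layer picks the class with min t
    return None             # neuron stays silent for this stimulus

T, n = 10, 4
x = (np.random.default_rng(1).random((T, n)) < 0.3).astype(float)
print(first_spike_time(x, 0.5 * np.ones(n), threshold=1.0))
```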


Subject(s)
Membrane Potentials/physiology, Neurological Models, Neural Networks (Computer), Neurons/physiology, Supervised Machine Learning, Humans
9.
Front Neurosci ; 13: 625, 2019.
Article in English | MEDLINE | ID: mdl-31354403

ABSTRACT

Application of deep convolutional spiking neural networks (SNNs) to artificial intelligence (AI) tasks has recently gained a lot of interest, since SNNs are hardware-friendly and energy-efficient. Unlike their non-spiking counterparts, most existing SNN simulation frameworks are not efficient enough in practice for large-scale AI tasks. In this paper, we introduce SpykeTorch, an open-source high-speed simulation framework based on PyTorch. This framework simulates convolutional SNNs with at most one spike per neuron and the rank-order encoding scheme. In terms of learning rules, both spike-timing-dependent plasticity (STDP) and reward-modulated STDP (R-STDP) are implemented, and other rules can be added easily. Apart from the aforementioned properties, SpykeTorch is highly generic and capable of reproducing the results of various studies. Computations in the proposed framework are tensor-based and performed entirely by PyTorch functions, which in turn brings the ability of just-in-time optimization for running on CPUs, GPUs, or multi-GPU platforms.
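
The at-most-one-spike, rank-order encoding SpykeTorch builds on can be sketched in plain PyTorch; this is an illustration of the idea (brighter inputs spike earlier), not the SpykeTorch API itself.

```python
import torch

def intensity_to_latency(image, time_steps):
    """image: (H, W) in [0, 1] -> binary spike wave of shape (time_steps, H, W)."""
    flat = image.flatten()
    order = torch.argsort(flat, descending=True)   # brightest pixels first
    bin_size = max(1, flat.numel() // time_steps)
    wave = torch.zeros(time_steps, flat.numel())
    for t in range(time_steps):
        wave[t, order[t * bin_size:(t + 1) * bin_size]] = 1.0
    return wave.view(time_steps, *image.shape)

spikes = intensity_to_latency(torch.rand(28, 28), time_steps=15)
print(int(spikes.sum()))   # each pixel spikes at most once over the 15 steps
```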

10.
Neural Netw ; 111: 47-63, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30682710

ABSTRACT

In recent years, deep learning has revolutionized the field of machine learning, for computer vision in particular. In this approach, a deep (multilayer) artificial neural network (ANN) is trained, most often in a supervised manner using backpropagation. Vast amounts of labeled training examples are required, but the resulting classification accuracy is truly impressive, sometimes outperforming humans. Neurons in an ANN are characterized by a single, static, continuous-valued activation. Yet biological neurons use discrete spikes to compute and transmit information, and the spike times, in addition to the spike rates, matter. Spiking neural networks (SNNs) are thus more biologically realistic than ANNs, and are arguably the only viable option if one wants to understand how the brain computes at the neuronal description level. The spikes of biological neurons are sparse in time and space, and event-driven. Combined with bio-plausible local learning rules, this makes it easier to build low-power, neuromorphic hardware for SNNs. However, training deep SNNs remains a challenge. Spiking neurons' transfer function is usually non-differentiable, which prevents using backpropagation. Here we review recent supervised and unsupervised methods to train deep SNNs, and compare them in terms of accuracy and computational cost. The emerging picture is that SNNs still lag behind ANNs in terms of accuracy, but the gap is decreasing, and can even vanish on some tasks, while SNNs typically require many fewer operations and are the better candidates to process spatio-temporal data.


Subject(s)
Action Potentials, Deep Learning, Neurological Models, Neural Networks (Computer), Action Potentials/physiology, Algorithms, Brain/physiology, Deep Learning/trends, Humans, Machine Learning/trends, Neurons/physiology
11.
Front Comput Neurosci ; 12: 74, 2018.
Article in English | MEDLINE | ID: mdl-30279653

ABSTRACT

Repeating spatiotemporal spike patterns exist and carry information. Here we investigated how a single spiking neuron can optimally respond to one given pattern (localist coding), or to either one of several patterns (distributed coding, i.e., the neuron's response is ambiguous but the identity of the pattern could be inferred from the response of multiple neurons), but not to random inputs. To do so, we extended a theory developed in a previous paper (Masquelier, 2017), which was limited to localist coding. More specifically, we computed analytically the signal-to-noise ratio (SNR) of a multi-pattern-detector neuron, using a threshold-free leaky integrate-and-fire (LIF) neuron model with non-plastic unitary synapses and homogeneous Poisson inputs. Surprisingly, when increasing the number of patterns, the SNR decreases slowly, and remains acceptable for several tens of independent patterns. In addition, we investigated whether spike-timing-dependent plasticity (STDP) could enable a neuron to reach the theoretical optimal SNR. To this aim, we simulated a LIF equipped with STDP, and repeatedly exposed it to multiple input spike patterns, embedded in equally dense Poisson spike trains. The LIF progressively became selective to every repeating pattern with no supervision, and stopped discharging during the Poisson spike trains. Furthermore, with appropriately tuned STDP parameters, the resulting pattern detectors were optimal. Tens of independent patterns could be learned by a single neuron using a low adaptive threshold, in contrast with previous studies, in which higher thresholds led to localist coding only. Taken together these results suggest that coincidence detection and STDP are powerful mechanisms, fully compatible with distributed coding. Yet we acknowledge that our theory is limited to single neurons, and thus also applies to feed-forward networks, but not to recurrent ones.
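
The analyzed setting can be reproduced schematically in a few lines: a threshold-free LIF neuron driven by homogeneous Poisson inputs through non-plastic unitary synapses. Parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
dt, T = 1e-3, 1.0            # 1 ms steps, 1 s of simulation
tau = 20e-3                  # membrane time constant (s)
n_inputs, rate = 500, 5.0    # 500 afferents firing at 5 Hz
w = 1.0                      # unitary synaptic weight

v = 0.0
trace = np.empty(int(T / dt))
for step in range(len(trace)):
    spikes_in = rng.random(n_inputs) < rate * dt  # Poisson approximation
    v += -v * dt / tau + w * spikes_in.sum()      # leak + summed input
    trace[step] = v
# `trace` is the potential whose SNR the theory evaluates when repeating
# patterns are embedded in the Poisson spike trains.
```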

12.
J Neurosci ; 38(44): 9563-9578, 2018 10 31.
Article in English | MEDLINE | ID: mdl-30242050

ABSTRACT

Neural selectivity in the early visual cortex strongly reflects the statistics of our environment (Barlow, 2001; Geisler, 2008). Although this has been described extensively in the literature through various encoding hypotheses (Barlow and Földiák, 1989; Atick and Redlich, 1992; Olshausen and Field, 1996), an explanation as to how the cortex might develop the computational architecture to support these encoding schemes remains elusive. Here, using the more realistic example of binocular vision as opposed to monocular luminance-field images, we show how a simple Hebbian coincidence-detector is capable of accounting for the emergence of binocular, disparity-selective receptive fields. We propose a model based on spike timing-dependent plasticity, which not only converges to realistic single-cell and population characteristics, but also demonstrates how known biases in natural statistics may influence population encoding and downstream correlates of behavior. Furthermore, we show that the receptive fields we obtain are closer in structure to electrophysiological data reported in macaques than those predicted by normative encoding schemes (Ringach, 2002). We also demonstrate the robustness of our model to the input dataset, noise at various processing stages, and internal parameter variation. Together, our modeling results suggest that Hebbian coincidence detection is an important computational principle and could provide a biologically plausible mechanism for the emergence of selectivity to natural statistics in the early sensory cortex.

SIGNIFICANCE STATEMENT: Neural selectivity in the early visual cortex is often explained through encoding schemes that postulate that the computational aim of early sensory processing is to use the least possible resources (neurons, energy) to code the most informative features of the stimulus (information efficiency). In this article, using stereo images of natural scenes, we demonstrate how a simple Hebbian rule can lead to the emergence of a disparity-selective neural population that not only shows realistic single-cell and population tunings, but also demonstrates how known biases in natural statistics may influence population encoding and downstream correlates of behavior. Our approach allows us to view early neural selectivity not as an optimization problem, but as an emergent property driven by biological rules of plasticity.


Subject(s)
Neural Networks (Computer), Neuronal Plasticity/physiology, Vision Disparity/physiology, Binocular Vision/physiology, Visual Cortex/physiology, Factual Databases, Humans
13.
IEEE Trans Neural Netw Learn Syst ; 29(12): 6178-6190, 2018 12.
Article in English | MEDLINE | ID: mdl-29993898

ABSTRACT

Reinforcement learning (RL) has recently regained popularity with major achievements such as beating the European champion at the game of Go. Here, for the first time, we show that RL can be used efficiently to train a spiking neural network (SNN) to perform object recognition in natural images without using an external classifier. We used a feedforward convolutional SNN and a temporal coding scheme where the most strongly activated neurons fire first, while less activated ones fire later, or not at all. In the highest layers, each neuron was assigned to an object category, and it was assumed that the stimulus category was the category of the first neuron to fire. If this assumption was correct, the neuron was rewarded, i.e., spike-timing-dependent plasticity (STDP) was applied, which reinforced the neuron's selectivity. Otherwise, anti-STDP was applied, which encouraged the neuron to learn something else. As demonstrated on various image data sets (Caltech, ETH-80, and NORB), this reward-modulated STDP (R-STDP) approach extracted particularly discriminative visual features, whereas classic unsupervised STDP extracts any feature that consistently repeats. As a result, R-STDP outperformed STDP on these data sets. Furthermore, R-STDP is suitable for online learning and can adapt to drastic changes such as label permutations. Finally, it is worth mentioning that both feature extraction and classification were done with spikes, using at most one spike per neuron. Thus, the network is hardware-friendly and energy-efficient.
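
The reward-modulated decision rule reads naturally as pseudocode; the sketch below, including the learning-rate constants and the multiplicative soft bound, is our hedged reconstruction of the described mechanism, not the authors' implementation.

```python
import numpy as np

def r_stdp_update(w, pre_before_post, correct, a_plus=0.004, a_minus=0.003):
    """w: synaptic weights in (0, 1); pre_before_post: boolean mask per synapse."""
    sign = 1.0 if correct else -1.0          # reward -> STDP, error -> anti-STDP
    dw = np.where(pre_before_post, a_plus, -a_minus)
    w = w + sign * dw * w * (1.0 - w)        # soft bound keeps w in (0, 1)
    return np.clip(w, 0.0, 1.0)
```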


Subject(s)
Neurological Models, Neuronal Plasticity/physiology, Neurons/physiology, Reward, Visual Perception/physiology, Animals, Computer Simulation, Humans, Nerve Net
14.
Neural Netw ; 105: 294-303, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29894846

ABSTRACT

Although representation learning methods developed within the framework of traditional neural networks are relatively mature, developing a spiking representation model remains a challenging problem. This paper proposes an event-based method to train a feedforward spiking neural network (SNN) layer for extracting visual features. The method introduces a novel spike-timing-dependent plasticity (STDP) learning rule and a threshold adjustment rule, both derived from a vector quantization-like objective function subject to a sparsity constraint. The STDP rule is obtained from the gradient of a vector quantization criterion, converted into spike-based, spatio-temporally local update rules in a spiking network of leaky integrate-and-fire (LIF) neurons. Independence and sparsity of the model are achieved by the threshold adjustment rule and by a softmax function implementing inhibition in the representation layer, which consists of WTA-thresholded spiking neurons. Together, these mechanisms implement a form of spike-based competitive learning. Two sets of experiments are performed on the MNIST and natural image datasets. The results demonstrate a sparse spiking visual representation model with low reconstruction loss, comparable to state-of-the-art visual coding approaches, yet our rule is local in both time and space, and thus biologically plausible and hardware-friendly.
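
One plausible reading of the softmax-driven winner-take-all competition is sketched below; the threshold, inverse-temperature parameter, and gating are illustrative assumptions rather than the paper's exact mechanism.

```python
import numpy as np

def wta_softmax(potentials, threshold=1.0, beta=5.0):
    z = beta * (potentials - potentials.max())
    p = np.exp(z) / np.exp(z).sum()          # softmax competition
    spikes = np.zeros_like(potentials)
    winner = int(np.argmax(potentials))
    if potentials[winner] >= threshold:      # WTA-thresholded firing
        spikes[winner] = 1.0
    return spikes, p                         # p can gate inhibition/updates
```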


Subject(s)
Machine Learning, Neural Networks (Computer), Automated Pattern Recognition/methods, Feedback, Neurological Models, Visual Pathways/physiology
15.
Front Neuroinform ; 12: 9, 2018.
Article in English | MEDLINE | ID: mdl-29563867

ABSTRACT

We developed Convis, a Python simulation toolbox for large-scale neural populations that offers arbitrary receptive fields via 3D convolutions executed on a graphics card. The resulting software proves to be flexible and easily extensible in Python, while building on the PyTorch library (The PyTorch Project, 2017), previously used successfully in deep learning applications, for just-in-time optimization and compilation of the model onto CPU or GPU architectures. An alternative implementation based on Theano (Theano Development Team, 2016) is also available, although not fully supported. Through automatic differentiation, any parameter of a specified model can be optimized to approach a desired output, a significant improvement over, e.g., gradient-free Monte Carlo or particle optimizations. We show that a number of models, including even complex non-linearities such as contrast gain control and spiking mechanisms, can be implemented easily. In particular, we can recreate the simulation results of VirtualRetina (Wohrer and Kornprobst, 2009), a popular retina simulation package, with the added benefit of providing (1) arbitrary linear filters instead of the product of Gaussian and exponential filters and (2) optimization routines utilizing the gradients of the model. We demonstrate the utility of 3D convolution filters with a simple direction-selective filter. We also show that it is possible to optimize the input for a certain goal, rather than the parameters, which can aid the design of experiments as well as closed-loop online stimulus generation. Yet Convis is more than a retina simulator; for instance, it can also predict the response of V1 orientation-selective cells. Convis is open source under the GPL-3.0 license and available from https://github.com/jahuth/convis/ with documentation at https://jahuth.github.io/convis/.
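
The core idea, modeling a receptive field as a 3D (time x height x width) convolution applied to a video tensor, fits in a few lines of PyTorch; the direction-selective kernel below is a hand-rolled illustration, not a filter shipped with Convis.

```python
import torch
import torch.nn.functional as F

video = torch.rand(1, 1, 32, 64, 64)           # (batch, channel, T, H, W)
kernel = torch.zeros(1, 1, 5, 5, 5)            # 5-step spatiotemporal RF
for t in range(5):
    kernel[0, 0, t, 2, t] = 1.0                # response ridge drifting rightward

response = F.conv3d(video, kernel, padding=2)  # dense spatiotemporal filtering
print(response.shape)                          # torch.Size([1, 1, 32, 64, 64])
```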

16.
Neural Netw ; 99: 56-67, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29328958

ABSTRACT

Previous studies have shown that spike-timing-dependent plasticity (STDP) can be used in spiking neural networks (SNNs) to extract visual features of low or intermediate complexity in an unsupervised manner. These studies, however, used relatively shallow architectures in which only one layer was trainable. Another line of research has demonstrated - using rate-based neural networks trained with back-propagation - that having many layers increases the recognition robustness, an approach known as deep learning. We thus designed a deep SNN, comprising several convolutional (trainable with STDP) and pooling layers. We used a temporal coding scheme where the most strongly activated neurons fire first, and less activated neurons fire later or not at all. The network was exposed to natural images. Thanks to STDP, neurons progressively learned features corresponding to prototypical patterns that were both salient and frequent. Only a few tens of examples per category were required and no labels were needed. After learning, the complexity of the extracted features increased along the hierarchy, from edge detectors in the first layer to object prototypes in the last layer. Coding was very sparse, with only a few thousand spikes per image, and in some cases the object category could be reasonably well inferred from the activity of a single higher-order neuron. More generally, the activity of a few hundred such neurons contained robust category information, as demonstrated using a classifier on the Caltech 101, ETH-80, and MNIST databases. We also demonstrate the superiority of STDP over other unsupervised techniques such as random crops (HMAX) or auto-encoders. Taken together, our results suggest that the combination of STDP with latency coding may be key to understanding the way the primate visual system learns, as well as its remarkable processing speed and low energy consumption. These mechanisms are also interesting for artificial vision systems, particularly for hardware solutions.
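
The simplified, latency-based STDP rule typically used in this line of work depends only on the sign of the spike-time difference, with a multiplicative term that softly bounds weights in (0, 1); the constants in this sketch are illustrative.

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.004, a_minus=0.003):
    """w, t_pre: arrays over synapses; t_post: post-synaptic spike time.
    Inputs that never fired can carry t_pre = np.inf (treated as late)."""
    ltp = t_pre <= t_post                     # pre at or before post -> potentiate
    dw = np.where(ltp, a_plus, -a_minus) * w * (1.0 - w)
    return np.clip(w + dw, 0.0, 1.0)
```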


Subject(s)
Action Potentials/physiology, Neural Networks (Computer), Neuronal Plasticity/physiology, Visual Pattern Recognition/physiology, Photic Stimulation/methods, Animals, Computer Simulation/trends, Humans, Learning/physiology, Neurological Models, Neurons/physiology, Visual Perception/physiology
17.
Neuroscience ; 389: 133-140, 2018 10 01.
Article in English | MEDLINE | ID: mdl-28668487

ABSTRACT

Repeating spatiotemporal spike patterns exist and carry information. How this information is extracted by downstream neurons is unclear. Here we theoretically investigate to what extent a single cell could detect a given spike pattern and what the optimal parameters to do so are, in particular the membrane time constant τ. Using a leaky integrate-and-fire (LIF) neuron with homogeneous Poisson input, we computed this optimum analytically. We found that a relatively small τ (at most a few tens of ms) is usually optimal, even when the pattern is much longer. This is somewhat counter-intuitive as the resulting detector ignores most of the pattern, due to its fast memory decay. Next, we wondered if spike-timing-dependent plasticity (STDP) could enable a neuron to reach the theoretical optimum. We simulated a LIF equipped with additive STDP, and repeatedly exposed it to a given input spike pattern. As in previous studies, the LIF progressively became selective to the repeating pattern with no supervision, even when the pattern was embedded in Poisson activity. Here we show that, using certain STDP parameters, the resulting pattern detector is optimal. These mechanisms may explain how humans learn repeating sensory sequences. Long sequences could be recognized thanks to coincidence detectors working at a much shorter timescale. This is consistent with the fact that recognition is still possible if a sound sequence is compressed, played backward, or scrambled using 10-ms bins. Coincidence detection is a simple yet powerful mechanism, which could be the main function of neurons in the brain.


Subject(s)
Action Potentials/physiology, Learning/physiology, Neurological Models, Neuronal Plasticity/physiology, Neurons/physiology, Computer Simulation, Time Factors
18.
Front Psychol ; 8: 1261, 2017.
Article in English | MEDLINE | ID: mdl-28790954

ABSTRACT

The human visual system contains a hierarchical sequence of modules that take part in visual perception at different levels of abstraction, i.e., the superordinate, basic, and subordinate levels. One important question is to identify the "entry" level at which the visual representation is commenced in the process of object recognition. For a long time, it was believed that the basic level had a temporal advantage over the other two. This claim has been challenged recently. Here we used a series of psychophysics experiments, based on a rapid presentation paradigm, as well as two computational models, with band-pass-filtered images of five object classes to study the processing order of the categorization levels. In these experiments, we investigated the type of visual information required for categorizing objects at each level by varying the spatial frequency bands of the input image. The results of our psychophysics experiments and computational models are consistent. They indicate that different spatial frequency information has different effects on object categorization at each level. In the absence of high-frequency information, subordinate- and basic-level categorization are performed less accurately, while superordinate-level categorization is performed well. This means that low-frequency information is sufficient for the superordinate level, but not for the basic and subordinate levels. These finer levels rely more on high-frequency information, which appears to take longer to be processed, leading to longer reaction times. Finally, to avoid a ceiling effect, we evaluated the robustness of the results by adding different amounts of noise to the input images and repeating the experiments. As expected, the categorization accuracy decreased and the reaction time increased significantly, but the trends were the same, showing that our results are not due to a ceiling effect. The compatibility between our psychophysical and computational results suggests that the temporal advantage of the superordinate (resp. basic) level over the basic (resp. subordinate) level is mainly due to computational constraints: the visual system processes higher spatial frequencies more slowly, and categorization at finer levels depends more on these higher spatial frequencies.
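
Band-pass-filtered stimuli of the kind used here can be produced with a difference-of-Gaussians filter: subtracting a heavily blurred copy from a lightly blurred one keeps a chosen spatial-frequency band. The cutoff values below are illustrative, not the experiment's actual bands.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def bandpass(image, sigma_low, sigma_high):
    """Keep frequencies between the two Gaussian cutoffs (sigma_low < sigma_high)."""
    return gaussian_filter(image, sigma_low) - gaussian_filter(image, sigma_high)

img = np.random.rand(256, 256)
low_only = gaussian_filter(img, 8)   # coarse content (superordinate cue)
mid_band = bandpass(img, 1, 8)       # finer content (basic/subordinate cue)
```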

19.
Neurophotonics ; 4(3): 031222, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28680907

ABSTRACT

Increasing evidence suggests that sensory stimulation not only changes the level of cortical activity with respect to baseline but also its structure. Despite having been reported in a multitude of conditions and preparations (for instance, as a quenching of intertrial variability; Churchland et al., 2010), such changes remain relatively poorly characterized. Here, we used optical imaging of voltage-sensitive dyes to explore, in V4 of an awake macaque, the spatiotemporal characteristics of both visually evoked and spontaneously ongoing neuronal activity, and their difference. With respect to the spontaneous case, we detected a reduction in large-scale activity ([Formula: see text]) in the alpha range (5 to 12.5 Hz) during sensory inflow, accompanied by a decrease in pairwise correlations. Moreover, the spatial patterns of correlation obtained during the different visual stimuli were on average more similar to one another than to that obtained in the absence of stimulation. Finally, these observed changes in activity dynamics approached saturation already at very low stimulus contrasts, unlike the progressive, near-linear increase of the mean raw evoked responses over a wide range of contrast values, which could indicate a specific switching in the presence of sensory inflow.

20.
Front Comput Neurosci ; 10: 92, 2016.
Article in English | MEDLINE | ID: mdl-27642281

ABSTRACT

View-invariant object recognition is a challenging problem that has attracted much attention in the psychology, neuroscience, and computer vision communities. Humans are notoriously good at it, even if some variations are presumably more difficult to handle than others (e.g., 3D rotations). Humans are thought to solve the problem through hierarchical processing along the ventral stream, which progressively extracts more and more invariant visual features. This feed-forward architecture has inspired a new generation of bio-inspired computer vision systems called deep convolutional neural networks (DCNNs), which are currently the best models for object recognition in natural images. Here, for the first time, we systematically compared human feed-forward vision and DCNNs on a view-invariant object recognition task using the same set of images, controlling both the kinds of transformation (position, scale, rotation in plane, and rotation in depth) and their magnitude, which we call the "variation level." We used four object categories: car, ship, motorcycle, and animal. In total, 89 human subjects participated in 10 experiments in which they had to discriminate between two or four categories after rapid presentation with backward masking. We also tested two recent DCNNs (proposed by Hinton's group and Zisserman's group, respectively) on the same tasks. We found that humans and DCNNs largely agreed on the relative difficulties of each kind of variation: rotation in depth is by far the hardest transformation to handle, followed by scale, then rotation in plane, and finally position (much easier). This suggests that DCNNs are reasonable models of human feed-forward vision. In addition, our results show that the variation levels of rotation in depth and scale strongly modulate both humans' and DCNNs' recognition performances. We thus argue that these variations should be controlled in the image datasets used in vision research.
